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Abstract. We consider optimization problems constrained by partial 
differential equations (PDEs) with additional constraints placed on the 
solution of the PDEs. We develop a general and versatile framework us- 
ing infinite-valued penalization functions and Clarke subgradients and 
apply this to problems with box constraints as well as more general 
constraints arising in applications, such as constraints on the average 
value of the state in subdomains. The framework also allows for prob- 
lems with discontinuous data in the constraints. We present numerical 
results of this algorithm for the elliptic case and compare with other 
state-constrained algorithms. 

1. Introduction 

In recent years there has been intense research in the discussion of con- 
strained minimization problems with partial differential equations. In par- 
ticular, in the context of state constraints different methods have been pro- 
posed and analysed, see for example [I6l[2l[25lll0l[5l[171[3ll[l0l[371[2ll 
[Ml ESI Eg. We refer in particular to [l8l [281 [2D] for a survey on state 
constrained elliptic problems as appearing in the numerical results later on. 
The latter includes a discussion of the specific case of linear elliptic prob- 
lems with general constraints and related issues regarding regularity. Due 
to the possibly low regularity of the arising multipliers in the several cases 
we consider purely primal methods for solving state constraint problems. 

Our interest is in using the calculus associated with the proximal sub- 
gradient and the Clarke subgradient to derive optimality conditions for a 
variety of state-constrained optimization problems which are more general 
than standard box-constrained problems, with motivations from several ap- 
plications, particularly that of radiation therapy treatment planning prob- 
lems (such as those described in [33]). We use exact penalization methods 
without smoothing, leading to the need for elements of nonsmooth analy- 
sis. The constraints can include requirements on both the weighted average 
value the state takes on a given subdomain and the portion of a subdomain 
where the state exceeds acceptable limits. For the sake of generality, we do 
not consider the question of constraint qualifications, as these can be highly 
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dependent on the type of PDE being considered as well as specifics for the 
state constraint. In the same vein, we consider problems with discontinuous 
data in the state constraints, further suggesting a focus on primal methods. 

For the case of infinite dimensional problems there have been some stud- 
ies on primal methods relying on penalization and smoothing for example 
in [1'6\ [TT[ [n\ . The method we use here can be contrasted with an approach 
presented in [38] for finite-dimensional linear-quadratic problems and ex- 
tended in [13] to the infinite dimensional case. This approach is based on 
smooth approximations to the exact /i-penalty function and has also been 
studied in |1H [12] for a finite number of constraints and other choices of 
smoothing and penalization update. In the finite-dimensional case, conver- 
gence for the smoothed penalty approach has been discussed in |9] under 
the assumption of MFCQ. This has been extended to infinite dimensions 
in [13] for the assumption of a strictly feasible point 1.14]. Recent publica- 
tions |32l [22l [181 EHj treat other regularization strategies not based on exact 
penalization. 

Other existing methods using primal and dual variables also rely on non- 
differentiable functions, e.g., the semi-smooth Newton method. Here, New- 
ton's method is applied to the first-order optimality system containing a 
non-differentiable function by reformulation of possible box-type state con- 
straints. The Clarke subgradient information on the reformulated optimality 
system is then used within the descent step of Newton's method. This has 
been treated for example in [191 El El [30] . 

Our goal is to provide a flexible framework for a variety of state constraints 
as opposed to focusing on methods tuned to a specific type of constraint 
(such as box constraints). The constraints investigated are by no means 
exhaustive but do provide an idea on the procedure in obtaining optimal- 
ity conditions for other state constraints. The arising numerical methods 
may not be as efficient in the case of box constraints when compared with 
methods mentioned above. However, as shown below, the framework gives 
a reasonably efficient procedure, at least anecdotally, when compared with 
black box and smoothed over-penalization methods (i.e. [TH]). After giving 
in Section [2] the description of the problems under consideration, we move to 
introducing the necessary nonsmooth calculus in Section [3} We then present 
the relevant necessary optimality conditions for the considered problems in 
Section |4] before providing a brief description of the descent algorithm possi- 
ble with these conditions. Numerical examples follow in Section [6] for elliptic 
problems with box constraints and weighted integral constraints with a brief 
description of the performance of the algorithms when compared with black- 
box algorithms as well as the smoothed over-penalization method mentioned 
above. 
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2. Problem Description 

Let r2 C M" be a convex, bounded domain, a function ip G L^(r2), and 
scalar a > be given. The theory that follows applies to general cost 
functions; specifically, we only require subdifferential regularity (discussed 
in Section |3]) and Lipschitz continuity of the reduced cost functional defined 
below for many of the results. However, for notational simplicity, we focus 
on minimizing the standard tracking functional J : L^(i7) x L^(r2) — )• M 
given by 



12' 



subject to 

(1) Sq = ^|;, 

where £ : L^(J7) — )• L^(r2) denotes the operator mapping a control q to the 
state ip which solves the relevant PDE. We restrict our attention to problems 
where the solution operator £ is linear and bounded; however, much of the 
subsequent analysis requires only the strict differentiability of this operator 
to hold. We consider the following three example constraints; in each we 
have as data constants a < b: 

• Box constraints: We require that the solution to the relevant PDE 
satisfy the standard box constraints a < ip{x) < h for almost all 

X ^n. 

• Weighted Integral Constraints: We instead require the solution 
to satisfy 

a < (u;,V')l2(q) < b 
for a given function w G L'^{Q,). This is motivated by the requirement 
in radiotherapy that a given portion of the body be given an average 
dose [3^. 

• Resistivity constraints: We also consider the case where the con- 
straint is pointwise, but of a (possibly) nonconvex differentiable op- 
erator applied to the state; i.e. 

a < RW{x) < b 

for almost all x £ i}. For the subsequent analysis, we require R be 
in the Frechet sense. 
We give here a more concrete example: In Section |6j we consider 
problems where £ solves Poisson's equation. If we view the solution 
as the steady-state temperature of a metal under heating/cooling by 
a source q, we will require that the electrical resistivity of the metal 
remain in a prescribed interval. Using the Bloch-Griineisen formula 
[1], we require that at every point, the resistivity R{^p) G L^(J7) lies 
in the interval [a, b] where 
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Here, K and Q are material dependent parameters. 
• Total coverage constraints: A further type of constraint is the 
following. Given < a < 6 and < c < 1, and a compactly 
contained closed subset of nonempty interior Z C il, we require that 
the state satisfy 

a < ip{x) < b 

for at least the fraction c of the total area in Z. In other words, we 
require 

C\Z\< / l[a,b]{i^{x))dx 

Jz 

where ^[a.b] is the standard characteristic function associated to the 
interval [a,b]. This example can be seen as requiring that a critical 
region has a minimal/maximal level of whatever the state represents. 
It is also motivated from radiation therapy, where one may require 
that a sufficient portion of a critical organ not receive a radiative 
dose above a given safe level as discussed in [33| . 

This list is not exhaustive; for instance, we can also constrain-in a com- 
bination of the second and third examples, {w, Rip)- as well as encoding 
constraints on the maxima of several quantities dependent on the state. 
The analysis that follows can be extended to much more general constraints 
as long as the appropriate operations acting on the state are either direc- 
tionally Lipschitz or strictly differentiable (as applicable in the calculus rules 
described below). We also note that these requirements often do not involve 
second-order differentiability. 

For each problem, we will use the standard indicator function Is ■ X ^ 
R U {+00} associated with a set S C X for a Banach space X : 



I six) 



0, xe S 
+00, X ^ S. 



Naturally, this is a convex function when the set S" is a convex set. This also 
means that it is convex when S is convex. Equipped with this, we create for 
each type of constrained problem an augmented reduced-cost functional. For 
the case of the box constrained problem, we look to minimize the function 
Jb : M U {+00} given by 

(2) jBiq) := l\\£{q) -M + ^Mll + J^ I[aM {£q) {x)dx 

where the last term takes values of either or +00 depending on if the 
state violates the constraint on a set of positive measure. As opposed to a 
standard Moreau-Yoshida regularization, the penalization here means that 
the minimizer of jb is also a feasible minimizer of the original constraints. 
Similarly, the weighted integral constrained problem involves minimizing 

(3) jwi{q) := \mq)-W2 + ^\\q\\l + /[a,6] ((^, ^^9))- 
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Next, we consider the resistivity constrained problem as minimizing 

(4) 3R{q) := \ 1 1^ (g) - V^l li + f I kl |^ + ^ I[aM {R{£q)) z. 

Finally, the total coverage constrained problem is rewritten as minimization 
of 

(5) hciq) ■=^\\£{q)-i'\\l + ^\\q\\l + iii-c)\z\ ( i(-oo,a)u(6,oo) q) dx^ 

which we write in this manner so that the integrand in the penalization term 
is lower semicontinuous. We note that Jb and jiyj are convex functionals 
while jn, jTC are not. We assume that the choice of a and h guarantees that 
there is at least one admissible control such that the resulting state satisfies 
the appropriate constraints. 

3. Tools from Nonsmooth Analysis 

In this section we recall the Clarke subdifferential and some of the related 
calculus rules which are proven in [6] and [7]- As our focus is on (possibly) 
nonconvex, extended- value functions on we construct the Clarke sub- 

differential from the related proximal subdifferential, which is defined in a 
Hilbert space setting. 

Definition 3.1. Let S C X be a closed subset of a Hilbert space X. Then 
we say that, for x ^ S, the vector x — s is a proximal normal direction to 
S at s ^ S iff {s} C S" n (x) and S n (x) = 0. The collection 

of all nonnegative multiples ^ = i{x — s), t > where x — s is a proximal 
normal direction at s € S is denoted N^{s) and called the proximal normal 
cone to S at s. Equivalently, the proximal normal cone to S at x is the set 
of vectors C such that for any 5 > 0, there is a a^^x > such that 

{C,y- x) < o-^,a;||y - x\\^, \/y £ Sn Bs{x). 

Definition 3.2. Let / : X — )• MU{-|-oo} be a lower semicontinuous function 
and let f{x) < oo. Then the vector Q £ X is a proximal subgradient at x if 

(C,-l)GiVf,,/(x,/(x)) 

where epi f denotes the epigraph of f . The collection of all such C, is denoted 
as dpf{x) which we call the proximal subdifferential. Equivalently, ( £ 
dp fix) if and only if there exists a^^^x > such that 

f{y) > f{x) + {C,y - x) +(T(;,x\\y - x\\'^,\/y G Bsix). 

From this, we construct the Clarke subdifferential and normal cone: 

Definition 3.3. Let / : X — t- M U {-|-oo} and x G X be as in the above 
definition. Furthermore, assume f is Lipschitz. We define the Clarke sub- 
differential, denoted by dcf{x) as the closed convex hull of the set 

difix) := {C|3xi ^ x,Ci G dpf{xi),Ci ^ Q. 
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Definition 3.4. For a closed set S, We denote by Ng{x) the closed convex 
hull of the set of vectors obtained by taking weak limits of sequences of Q E 
Ng{xi) where Xi € S and Xi — t- x. 

This definition to the Clarke subdifferetial, discussed in the finite dimen- 
sional case in [36] and in the general Hilbert setting in [7j, is equivalent to 
the definitions in [6j. However, at times we will make use of the proximal 
constructions in the presence of non-Lipschitz data. We note that one can 
extend the Clarke subdifferential to non-Lipschitz functions but is not nec- 
essary for our analysis. Finally, we will make use of Clarke's generalized 
directional gradient: 

Definition 3.5. Let f : M" — t- M and x £ M". Then the generalized direc- 
tional gradient of f at x in direction v is given as 

f{y + tw) -r 



f°{x;v) :=limsup inf 



iy,r)lfx t 



where {y,r) x means y — t- x, r — )• f{x) with {y,r) £ epif . 

We also note that in the case of convex functions (respectively, sets), the 
proximal and Clarke subdifferentials (respectively, normal cones) coincide 
with the usual notions from convex analysis. Furthermore, if x* locally 
minimizes /, then G dcf{x*); the converse holds if / is convex: if E 
dcf{x*), then x* is a global minimizer (see, for instance. Proposition 4.3 of 
Chapter 2 of [7]). 

In order to make use of the calculus rules for the Clarke subdifferential, 
we recall two additional definitions. 

Definition 3.6. For two Banach spaces X, Y we say F : X ^ Y is strictly 
differentiable at x if, there exists DsF[x) € L[X, Y) such that for any v 

r F{y + tv)-F{y) p. . > 

and that the convergence is uniform for v in compact sets. 

This is a Hadamard-type strict derivative. Similar versions exist for 
Gateaux-type and Frechet-type strict differentiability. We will only focus on 
Hadamard-type strict derivatives for the remainder of the paper, and thus 
only refer to such D^F as the strict derivative. 

Definition 3.7. Let / : X — )• MU{-|-oo} be an extended real-valued function. 
We say f is directionally Lipschitz at x with respect to v G X if f{x) < oo 
and 

f{y + tw)-r 
lim sup 

{y,r)-lfX,w—^v,H,0 ^ 

is finite. We say f is directionally Lipschitz at x if f is directionally Lips- 
chitz at X with respect to at least one v. 
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Theorem 1 of |36] states, in particular, that a lower semicontinuous func- 
tion / : X — )• M, for some Hilbert space X, is directionally Lipschitz at 
X if /(x) is finite and Lipschitzian at x. Alternatively, and perhaps more 
useful in this context, the function is directionally Lipschitz if it is convex 
and bounded on an open set (not necessarily containing the point under 
consideration) . 

Equipped with these two definitions, we state calculus rules for Clarke 
subdifferentials when applied to extended-value functions. For proofs, we 
refer the reader to |6]. We also note that more complex and weaker rules 
exist for the proximal subdifferential, which may be required in the future 
when dealing with more general constraints (such that the associated cost 
functions are merely lower-semicontinuous) . 

Lemma 3.1. Let f = g o F where F : X ^ Y is a strictly differentiable 
map between Banach space X and Hilbert space Y and 5 : y — )■ M U {-|-oo}. 
If g is finite and directionally Lipschitz at F{x) with 

DsF{X) n int{v : g°{F{x); v) < 00} ^ 0, 

then we have 

dcf{x)<^DsF{xYodcg{F{x)). 

Equality holds if g is a convex, extended value function or is in the 
Frechet sense. 

Proof. This is proven as Theorem 3 of f36]. □ 

The condition for equality is a specific case of subdifferential regularity, 
termed by Rockafellar in [36]. There are more general conditions where 
equality can hold, as discussed in that work; another possibly salient char- 
acterization is the case where g is the indicator function of a set whose 
tangent and contingent cones coincide. However, for our purposes here, we 
will not need them. We also will make use of the following sum rule proven 
as Corollary 2 of j36| . 

Proposition 3.1. For any finite collection of fi : X — )• M, where X is a 
Banach space, and a point x where all fi{x) < +co, we have if all but one 
is Lipschitzian at x that 

^c(Zf^{x))<^Y.^cf^{x). 

If all these functions are convex, extended-valued or in the Frechet sense, 
equality holds. 

We note that for every x £ S, we have: 

dclsix) := N^{x). 

and that if Is{x) < 00 and S has nonempty interior, then Is is directionally 
Lipschitz at x, as noted above. Also, as noted in [36j, if S is convex. Is is 
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subdifferentially regular wherever it is finite. Therefore, in particular, we 
have 

if a < X <b 

dcl[a,b] (x) = A^[a,fc] (2;) = < (-00, 0] if a; = a 

_ [0, +00) iix = b 

Again, this is not the strictest set of conditions for the subdifferential reg- 
ularity of Is, but they suffice for our purposes here. Thus, if we have con- 
straints requiring the result of a strictly differentiable operator applied to 
the state remains in a convex set with nonempty interior of a Hilbert space, 
we may make use of the above calculus rules with equality instead of merely 
inclusion. 

Finally, we note the following fact about descent directions and subdif- 
ferentials. 

Proposition 3.2. Let f : X ^ MU{-|-cxd} be Lipschitz and f{x) < 00. Then 

— argmin 

is a descent direction at x. 
Proof. (Sketch) We have 

— argmin ||CII = — argf min max(C,d)) 
CGdcfix) Ce9c/WIMII<i 

= — argf max min (C.d)) 

\\d\\<ic&dcf{xy 

= — argf max min (C,—d)). 
IHI<iCe%/W 

= argf min max (C,d)). 

M\<i (edcfixy 

which means that the directional derivative is minimized, as the directional 
derivative in direction d is equal to 

max (C:d). 
C&dcfixy 

□ 



4. Optimality Conditions 

With the above calculus rules, we turn to deriving conditions for the 
optimality of a control in the context of the three problems discussed in 
Section [2| We note that, as £ is linear and bounded, the first two terms of 
JB, jwii and jn are twice continuously Frechet differentiable. We denote 
by £* : — )• L'^{Q) as the adjoint solution operator. In the case of the 
box constrained problem, we may use the chain rule and sum rule to obtain 
the following. 
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Proposition 4.1. The function q* G L'^{Q) is a global minimizer of Jb if 
and only if 

G {£*{£q* - V^) + aq} + £* (iVg,] {£q*)) ■ 

Remark 1 . The last term is interpreted in the following way: At every point 
a; G J7, we have a set of normal vectors to the set [a, b] at the value {£q*){x) 
We collect the set of selections from the resulting set and apply the 
adjoint solution operator to these selections. 



Proof. We note that 



2 



2" " ""^ 2 

is a convex functional of q as is the extended- value J^I[a,b]{^Q)^ meaning 
that js is convex. Therefore, q* is a minimizer if and only if G dajsiQ*)- 
By the sum rule — and noting that the first two terms in js are and 
therefore Lipschitz and everywhere finite — we have that 

dcjB{q)^dc \\\£q-'^\\l + '^M\l +dc j^I[a,b\[£q)dx. 

As we note that the integral term is a linear operator composed with a con- 
vex operator acting on the control, the subset containment above is equality 
wherever the function jb is finite. The second step requires the characteri- 
zation of the subdifferential of an integral operator. For a general treatment 
of this in the case of finite- valued Lipschitz integrands, we note Theorem 
5.18 of [7]; for the special case where the integrand is an indicator function 
of a closed convex set composed with a smooth operator, we note Theorems 
2 and 3 of ^J. In the latter case, we have 



dc 



I[a,b]i£q)i^)dx = / dcI[a,b]{Sq){x)dx 
Jn 



In order to make use of the chain rule, we note that wherever £q{x) = b, 
and if v{x) < 0, then I°^^^{£q{x);v) < oo (similarly for when £q{x) = a). 
And so we may apply the chain rule for linear £* , the case here. And so we 
have 

dc{l[a.b]{eq))=£*{N^:,b]{£q)). 
Thus we have the subgradient characterization of the minimizer. □ 

Similarly, we characterize the global minimizers for the weighted integral 
constrained problem: 

Proposition 4.2. The function q* G L^(r2) is a global minimizer of jyyi if 
and only if 

G {£*i£q* + aq*} + (n^^^,^{{w, £q*)) • £*{w). 
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Proof. The arguments in the previous proof regarding the necessity and 
sufficiency of S dcjwi {q* ) hold here as well, as does the application of the 
sum rule. The primary difference here is the chain rule calculation of the 
penalty term. We note the Clarke subgradient of this term is a dual element 
which can be written in weak form, for a test function ^p G as 



dc(.I[a,b](.{w,Sq*)))i^dx 



Ds[{w,£{q*))yoN^^,^{{w,£q*)) 
£*{wN^;^,^{{w,£q*))Hx 
£*{w)-N{i^,^{{w,£q*mdx 



tpdx 



which provides the remaining ingredient for the proof. 



□ 



We note that jn is a nonconvex functional of the control and so we only 
obtain the familiar necessary condition for optimality; the proof is the same 
as in Proposition |4.1| : 

Proposition 4.3. // the function q* G is a local minimizer of jr, 

then 

G {£* {£q* -^)+aq*}+£*oR'o iVf, {R{£q*)) 

where R' is the normal Frechet derivative associated with the point tp* and 
is being applied to selections from the normal cone at each point x G 0. 

Finally, under additional assumptions on £ and the optimal control q* , 
we have the necessary optimality conditions for jtc '■ 

Proposition 4.4. Suppose that the function q* G -L^(il) is a local minimizer 
of he, and 

(1) Z CC has boundary and nonempty interior, 

(2) £ is surjective on the space of extensions of functions in C^{Z), 

(3) £{q*) is continuous on Z. 
Define f : — )• L^(r2) given by 



fiQ) ■■- 



[1 



(— oo,a)U 



{b,oo){£q)]{x)dx 



either 



f{q*) < (1 - c)\Z\ and -aq* = £*{£q* - ip), or 

fil*) = (1 ~ c)|Z| and for any desired accuracy e > 0, there exists 

q, s.t. (qeJiqe) S {r* , f{q*)) + B„ with 

G I3{£*i£q* -i^) + aq*}+ Bf + jdpf{q,) 

where 

' (-1, 0] • [£*{l^\[a,b]£<le)] (x) if [£q,]{x) = 6, 
[0, 1) • [£*{l^\[,^b]£qe)] (x) if [£qe]{x) = a, 
else, 



dpf{qe){x) 
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and /3 + 7 = 1 and Bf denotes the e-ball in the weak topology of 

Remark 2. The assumption on the image £ is motivated by the use of the 
method of manufactured solutions in apphcations. In such apphcations, one 
may test solution methods by applying the relevant differential operator to 
a constructed solution in order to find a relevant control/source. It is easy 
to find examples where it holds; one example is where £ is associated to the 
Poisson problem with distributed source: 

—Ai/j = q. 

If we instead have £ associated to the Poisson problem with boundary con- 
trol, 

-A^p = 0, ^p\^n = q, 

then clearly, the relevant assumptions are not satisfied in the most general of 
settings. We also are not sure how strict the assumption on the continuity of 
the optimal state is in a general setting. However, it is possible that further 
information on the specific form of Z and $7 will allow for this assumption 
to be unnecessary. 

Remark 3. The "fuzzy" nature of this rule may seem an obstacle at first 
glance to implementation. However, if we discretize the problem with ap- 
propriate error bounds, we may require e to be sufficiently small to be dom- 
inated by the discretization error. Furthermore, the "fuzzy" nature of the 
optimality condition can be improved in the presence of constraint qualifi- 
cations being satisfied. See [1] for a survey of some appropriate constraint 
qualifications. 



Proof, (of Proposition 4-4^ For simplicity, we only consider the case where 
a = — oo. Then we have 



f{q) := / [l^b^^){£q)]{x)dx 
Jz 

Then if f{q*) < (1 — c)\Z\, it is clear that the result holds as the constraint 
is inactive and we recover the classical optimality conditions. Assume that 
f{q*) = (1 — c)\Z\. We know that the following set has positive measure: 

Z'' ■.= {xe Z : £q*{x) G (6,oo)} 

and thus contains a nonempty open set. So there exists an open set of non- 
negative smooth functions {^7} with support contained in Z^ . By assump- 
tion there is a set of corresponding controls such that (f)^ = [£py)\z- By 
construction, f{q + tq-y) = f{q) for all t > 0. Therefore there is an open set 
of directions such that f at q is directionally Lipschitz along them. 

Then, we note that if C ^ dpf{q) then almost everywhere in we have 
that 

C{x) € £:*(i(f,^oo)'^^) ■ [dp'^{b,oo)£Q(.x)]; 
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where we note that differentiation under the integral is allowed for proper 
lower semicontinuous integrands in L'^{Q) as discussed in [35] and [21] and 
that, as above, £ is regular in the sense of the Clarke. It is immediately seen 
that 

fl if[£q]{x) = b, 



else. 



The rest of the proof follows from a lemma which is the following particular 
case of Theorem 2.14 from [3]. 

Lemma 4.1. Assume that q* is a local minimizer of Jtc- Then for any 
number e > 0, and weak neighborhood V of 0, there exists q such that 
{qJ{q))^q\f{q*))+eB and 

Oe^ri£{q*-i^))+jdpf{q) + V 

where ^ + 7 = 1. 

The referenced Theorem is in a much more general setting, and relies 
on a "fuzzy" sum rule (see also [7J for a discussion of such rules). Here 
we take advantage of the smoothness of the tracking functional to reduce 
the "fuzziness" of that rule to some extent. From there the result follows 
immediately. □ 



5. Optimization Algorithms 

5.1. Subgradient descent method. Throughout this section we consider 
a steepest descent algorithm for solving the problems by noting that the 
element subgradient of minimal norm is a direction of descent. 

Algorithm 5.1. We initialize with a feasible point q^^\ 

(1) At q^^\ we calculate the state ip^^^ := £{q^^'') and the classical adjoint 

(2) We then minimize ^\\p + A'-'^-* + ag*^'^-*||| + f ||/o||| over p and label 
the minimizer p^^^ for a very small ^ > 0. If the minimum norm is 
below tolerance, STOP. The minimization is constrained depending 
on the problem: 

(a) For the box constrained problem, p{x) G N^a,b]i'4^^'''^){x): 

(b) For the weighted integral constrained problem, p{x) = r-£*{il:^^^) 
for re N^a,b\{{wA^^^)), 

(c) For the resistivity constrained problem, p{x) G £*oR'oN^^^^{R{'iIj^'''^)){x). 

(3) Perform an Armijo line search along the direction d^^"^ := —p^^'' + 

_(y^q{k) obtain q^^~^^\ If resulting step size is below tolerance, 
STOP. 

For the total coverage constrained problem, we consider for notational 
simplicity only the case where we have discretized the problem. Then the 
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minimization in step (2) is performed only if the constraint is active at the 
current iterate and is taken over 

Ph{x) G -[f;:(iK\MV'f )](x) • iVM(V'f ) 

where we use the subscript h to emphasize this is taken in the discrete setting 
(that is, £h is the discretized adjoint operator). 

Note that the direction search is only performed if the constraint is active 
and that for the case of the weighted integral constraint, it is an additional 
line search. In the case of the box constrained and resistivity constrained 
problems, the search is of dimension equal to the number of points where 
the constraint is active. Thus, the number of function evaluations in this 
case will increase as the mesh is refined, in general. 

5.2. BFGS Implementation. Alternatively, we also use a BFGS-type al- 
gorithm for solving weighted integral problems. 

Algorithm 5.2. We initialize with a feasible point q^^\ and = I. 

(1) At q^^\ we calculate the state := 8{q^^'>) and the classical adjoint 

(2) We then minimize \\\p + \^^^ -\- aq^^^\\2 + ^\\p\\2 over p and label the 
minimizer p^^^ for a very small /3 > 0. If the minimum norm is below 
tolerance, STOP. Here, p{x) = r ■ £*{ip'^^^) for r G N[a,h]{{w,ip''''^)), 

(3) Define d^^) := p^''^ + A^^) + aq^^'l Update H^^^ via standard BFGS 
update (such as in 134J or \23\). 

(4) Perform a Armijo line search along the direction {H^^^)~^d^^^ to 
obtain q^''^^^ If resulting step size is below tolerance, STOP. 

The method has anecdotally shown good performance, as described in the 
following section, but we have not established that the algorithm works in 
the most general setting. We make this comment as this method fails in the 
case of box constraints for the numerical examples in Section [6j Issues, for 
consideration of implementing quasi-Newton type methods for nonsmooth 
functions have been discussed in greater detail in [26]; our problems do not 
in general satisfy the conditions listed there. In fact, there are relatively 
simple examples involving exact penalization where an implementation of 
the above BFGS method will fail. Specifically, problems arise when the 
algorithm finds consecutive points at which the function under consideration 
is nondifferentiable. We briefly present one such instance outside of the 
context of PDE-constrained optimization. 

Example 1. Let S* C be the closed £^ unit ball and f{x,y) := (x — 
1)^ + (y — If we begin at the point x^^^ = (—0.5, —0.25)"^, with initial 
H^^^ = I, then an inexact line search along —II^^Vf{x^'^^) results in x^^^ = 
(0.6207,0.3793)"^ and search direction (-0.2393,0.4363)"^; this results in a 
line of infeasible points for the line search. We note that the element of 
the subgradient Vf{x^^^) + N^{x^^^) of approximately minimal norm in this 
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case, (-0.3691,0.3691))^ results in a feasible search direction. In light of 
this, we do not use the BFGS method for box constrained problems. It is 
possible that with appropriate modifications, this can be improved to handle 
such issues. 

6. Numerical Examples 

In this section, we focus on test problems where the solution operator E is 
associated to solving the Poisson equation subject to homogeneous Dirichlet 
boundary conditions. 

-AV' = q 
i'\dQ. = 0. 

6.1. Mesh dependence. We test the box constrained algorithm where the 
desired state is 

V^i := £:(e-l"l'/4) 

where O the unit circle; it is then rescaled into a unit function. In the 
case where a = 1.0365, the constraint is active on a set of measure zero and 
we have a residual error ||^* — V'lli ~ 3.5678e — 04 for a mesh with 1445 
elements. 

We use l.e — 5 as tolerance for both convergence tests described above and 
set a = .001. We first consider the box constrained problem with an upper 
bound of 0.8 (the lower bound is not considered) and vary the mesh. We 
compare against a standard SQP algorithm with classic gradient information 
as a benchmark in Table[T| results are given for meshes with varying numbers 
of elements and fixed target ipi and upper bound 0.8. The total number of 
PDE solves includes calls to evaluate the functional during line searches 
along the descent direction. The steepest descent method converges after 
relatively few evaluations of the PDE solver, offering a possible alternative to 
black box algorithms in applications where the solver is relatively expensive 
compared with other portions of the algorithm. We note that the bulk 
of the calls to the PDE solver occur in the final iterations' subgradient 
minimization. 

A further breakdown of the work done in the steepest descent algorithm 
is shown in Table [2] along with a comparison with results from the smoothed 
penalty algorithm of ^15j. We use a fixed, relatively coarse mesh and vary the 
severity of the constraint. We show number of iterations until convergence, 
the number of outer iteration calls to the PDE solver (combining line search 
calls with calls for the current state and classical gradient), and the required 
adjoint equation solves required for the subgradient minimization. We note 
that this compares favorably, at this level of mesh refinement, with the 
smoother penalty algorithm with regards to iterations for convergence or 
calls to the PDE solver. One result of minimizing js is shown, compared 
against the target, in Figure [TJ 
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Inf. Pen. 


SQP 


Elem # 


Opt. Cost 


Iters. 


PDE Solves 


Opt. Cost 


Iters. 


PDE Solves 


377 


0.0113 


6 


2620 


0.024 


20 


7764 


541 


0.0119 


6 


3310 


0.0089 


97 


54304 


749 


0.0066 


6 


5808 


0.0099 


29 


22005 


1445 


0.0067 


9 


12001 




43 


64110 


2109 


0.008 


14 


34709 




46 


99646 


Table 1. Mes 


1 dependence for the box constrainted prob- 



lem using steepest descent and infinite penalization; compar- 
ison with standard SQP (maximum of state is 0.8) 





Inf. Pen. 


Smoothed Penalty 








PDE Solves 




PDE Solves 


Max. of il^ 


Opt. Cost 


Iter. 


Outer 


Subgr. 


Iters. 


Outer 


Inner 


0.2 


0.3093 


5 


223 


5246 


28 


139 


4753 


0.3 


0.2388 


5 


231 


3908 


26 


205 


6192 


0.4 


0.1731 


5 


237 


2952 


26 


153 


5755 


0.6 


0.0483 


6 


223 


5246 


38 


188 


9349 


0.8 


0.0066 


6 


561 


5247 


22 


118 


5737 



Table 2. Work done by infinite penalization(outer itera- 
tions, line search evaluations, subgradient minimization eval- 
uations) from varying box constraint levels versus work done 
by Smoothed Penalty algorithm on mesh with 749 elements 




(a) Target with upper bound (b) Result of infinite penalization 

Figure 1. Box constrained target and result from gradient 
descent using infinite penalization 



We also test the weighted integral constrained problem with the same tar- 
get, tolerance and regularization parameter a, and w as the characteristic 
function associated with the 0.25 radius ball using the BFGS method of Sec- 
tion 5.2, The upper bound is set to 0.12. As a reference, (u^, ■0)L2(r2) ~ 0.1817 
in this example. By encoding the constraint via infinite penalization, we 
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Inf. Pen. 


SQP 


Elem #. 


Opt. Cost 


Iters. 


PDE Solves 


Opt. Cost 


Iters. 


PDE Solves 


377 


0.0167 


6 


446 


0.021 


14 


5330 


541 


0.0263 


4 


283 


0.0262 


14 


7641 


749 


0.0197 


6 


440 


0.0220 


14 


10554 


1105 


0.0217 


4 


281 


0.0316 


10 


11099 


1445 


0.0209 


4 


268 


0.0225 


14 


20300 


2109 


0.0212 


4 


284 


0.0226 


9 


19023 



Table 3. Weighted integral problem mesh dependence comparison 



achieve significant speedup with the number of PDE solves being indepen- 
dent of mesh while capturing the minimum more closely than the SQP. One 
such result is displayed in Figure [2| along with a contour plot of the result 
to show the effect of the constraint. We repeat this for a problem where 
the source is the same, however the mesh is an L-shaped domain, w is the 
the characteristic of [—0.5, 0.5] x [—0.25, —0.75] and the upper bound for the 
average value is set as ^{w,ip) x^2(^q. The relative counts for iterations and 
PDE solver calls are similar to those of the previous example; thus, we show 
the residual for the results of infinite penalization and SQP compared with 
the boundary of the support of w along with the optimal result from infinite 
penalization. 

7. Concluding Remarks 

We have considered a collection of primal-based methods in the absence 
of constraint qualifications for deriving necessary optimality conditions for 
optimal control problems involving PDEs whose solution operator is linear 
and bounded. The framework used is rather flexible, and can be extended 
to cases where the operator £ is strictly differentiable, as opposed to lin- 
ear and bounded. Furthermore, other constraints can be implemented, as 
long as they satisfy the appropriate regularity in order to make use of the 
appropriate chain rule. Using these conditions, we implemented a steepest 
descent algorithm and, in one case, a BFGS-type algorithm and demon- 
strated performance improvements over a black box SQP solver, especially 
in the case of the weighted integral constrained problem. At this stage, it 
is unclear to what extent fully optimizing the norm of the subgradient is 
required for descent; also it may be possible to use nonsmooth second-order 
rules to develop a robust Newton-like method. 
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